An Algorithm for Word-Level Alignment of Parallel Dependency Trees1
نویسندگان
چکیده
Structural divergence presents a challenge to the use of syntax in statistical machine translation. We address this problem with a new algorithm for alignment of loosely matched non-isomorphic dependency trees. The algorithm selectively relaxes the constraints of the two tree structures while keeping computational complexity polynomial in the length of the sentences. Experimentation with a large Chinese-English corpus shows an improvement in alignment results over the unstructured models of (Brown et al., 1993).
منابع مشابه
Improving Word Alignment Using Alignment of Deep Structures
In this paper, we describe differences between a classical word alignment on the surface (word-layer alignment) and an alignment of deep syntactic sentence representations (tectogrammatical alignment). The deep structures we use are dependency trees containing content (autosemantic) words as their nodes. Most of other functional words, such as prepositions, articles, and auxiliary verbs are hid...
متن کاملClassification Of Semantic Relations By Humans And Machines
This paper addresses the classification of semantic relations between pairs of sentences extracted from a Dutch parallel corpus at the word, phrase and sentence level. We first investigate the performance of human annotators on the task of manually aligning dependency analyses of the respective sentences and of assigning one of five semantic relations to the aligned phrases (equals, generalizes...
متن کاملTowards a Weighted Induction Method of Dependency Annotation
This paper presents a method of annotating sentences with dependency trees which is set within the mainstream of the study on dependency projection. The approach builds on the idea of weighted projection. However, we involve a weighting factor not only in the process of projecting dependency relations (weighted projection) but also in the process of acquiring dependency trees from projected set...
متن کاملAutomatic Learning of Parallel Dependency Treelet Pairs
Induction of synchronous grammars from empirical data has long been a problem unsolved; despite that generative synchronous grammars theoretically suit the machine translation task very well. This fact is mainly due to pervasive structural divergences between languages. This paper presents a statistical approach to learn dependency structure mappings from parallel corpora. The algorithm introdu...
متن کاملProjection-based Annotation of a Polish Dependency Treebank
This paper presents an approach of automatic annotation of sentences with dependency structures. The approach builds on the idea of cross-lingual dependency projection. The presented method of acquiring dependency trees involves a weighting factor in the processes of projecting source dependency relations to target sentences and inducing well-formed target dependency trees from sets of projecte...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003